General notes on project: - max of 15 pages

Executive Summary {#executive-summary} - Tonah

Introduction {#introduction} - Tonah

  • make sure to include citations - showing why this research matters (can you send me PDFs of the citations you include so I can include them in the final project doc)
    • create citations in the same way we did before in diabetes project

Methods

Data Summary /EDA {#eda} - Keana

Our dataset originates from the most recent version of the Canadian Nutrient File (updated in 2015). The database contains average values for nutrients in foods available in Canada. These averages are based on the generic versions of a food, unless a brand is specifically included in the database. The dataset is bilingual, with food names, descriptions, and background information in both French and English. A major goal of this version was to update nutrient values for the foods that contribute the most sodium to the diet, since manufacturers have been working to reduce the sodium content of their products.

The database covers 5690 unique foods, ranging from foods such as Cheese souffle to Vanilla extract, and provides the average nutrient levels per 100 grams.

Notably, some of the nutrients included are subcomponents of other nutrients. Others, like the two metrics for food energy (kcal and kJ), are simply different units for the same quantity (1 kcal = 4.184 kJ).
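As a quick check of this redundancy, here is a minimal sketch that converts the kilocalorie summary statistics reported in the Appendix to kilojoules. The vector names `kcal_summary` and `kj_summary` are only illustrative.

```r
# Summary statistics for ENERGY (KILOCALORIES), copied from the Appendix
kcal_summary <- c(mean = 219, median = 174, max = 902)

# 1 kcal = 4.184 kJ, so the kJ column should be a rescaled copy of the kcal column
kj_summary <- kcal_summary * 4.184
print(round(kj_summary))
```

The converted values land within a couple of kJ of the ENERGY (KILOJOULES) row in the Appendix (915, 727, 3774). Small gaps are expected because the inputs are rounded, so the two columns appear to carry the same information.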

It is also worth noting that many values are missing. For instance, only about 1.8% of the rows have values for biotin, since 98.2% are missing (see the Missing values in clean dataset section).
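The share of non-missing values per variable can be computed along these lines. This is a sketch on a made-up data frame (`toy` is hypothetical), followed by the same arithmetic applied to the biotin counts listed in the Appendix.

```r
# Toy data frame standing in for the cleaned data (values are made up)
toy <- data.frame(
  PROTEIN = c(1.2, 3.4, 0.5, 7.7),
  BIOTIN  = c(NA, NA, 2.1, NA)
)

# Percentage of rows with a non-missing value, per column
pct_present <- colMeans(!is.na(toy)) * 100
print(pct_present)

# Same calculation from the Appendix counts for biotin:
# 5585 of 5690 rows are missing
biotin_present <- (5690 - 5585) / 5690 * 100
print(round(biotin_present, 1))
```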

The original dataset was relational, with unique identifiers for the main variables of interest. We therefore merged the tables on these identifiers (e.g., FoodID) using the left_join function. Then, we reshaped the merged data from long to wide format, so that each food occupies one row and each nutrient one column.

The original dataset was cleaned for analyses in the following ways:

(See the Appendix for a full summary of the variables in the cleaned dataset.) All of the subsequent summary statistics and analyses in the paper use the clean dataset.

notes from here: chrome-extension://oemmndcbldboiebfnladdacbdfmadadm/file:///C:/Users/keana/OneDrive%20-%20PennO365/Comp_transfer2018/Penn/fourth_yr/nutrients_project_stat571/cnf-fcen-csv/CNF%202015%20users_guide%20EN.pdf

description of file contents:

  • From FOOD NAME.csv: FoodID (merging var), FoodGroupID (merging var), FoodDescription
  • From NUTRIENT NAME.csv: NutrientID (merging var), NutrientName, NutrientUnit (possibly as control??)
  • From NUTRIENT AMOUNT.csv: NutrientValue, FoodID (merging var), NutrientID (merging var)
  • From FOOD GROUP.csv: FoodGroupID (merging var), FoodGroupName

Main file of interest is NUTRIENT AMOUNT.csv – contains long version of dataset, where there are multiple rows for each food, along with a col for the nutrient identifier and the nutrient value associated with that identifier

Will have to combine the multiple datasets in one to have all info in one place
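A minimal sketch of how the four files could be combined, assuming the column names listed above. The tiny data frames and their values are made up for illustration, and the actual cleaning script may differ.

```r
library(dplyr)
library(tidyr)

# Tiny stand-ins for the four CSV files (made-up values, column names from the notes above)
food_name <- data.frame(FoodID = c(1, 2), FoodGroupID = c(10, 20),
                        FoodDescription = c("Cheese souffle", "Vanilla extract"))
food_group <- data.frame(FoodGroupID = c(10, 20),
                         FoodGroupName = c("Group A", "Group B"))
nutrient_name <- data.frame(NutrientID = c(1, 2),
                            NutrientName = c("PROTEIN", "FAT (TOTAL LIPIDS)"))
nutrient_amount <- data.frame(FoodID = c(1, 1, 2), NutrientID = c(1, 2, 1),
                              NutrientValue = c(9.5, 17.0, 0.1))

# Start from the long file and attach the lookup tables on their identifiers
long_data <- nutrient_amount %>%
  left_join(nutrient_name, by = "NutrientID") %>%
  left_join(food_name, by = "FoodID") %>%
  left_join(food_group, by = "FoodGroupID")

# One row per food, one column per nutrient; unreported nutrients become NA
wide_data <- long_data %>%
  select(FoodID, FoodDescription, FoodGroupName, NutrientName, NutrientValue) %>%
  pivot_wider(names_from = NutrientName, values_from = NutrientValue)
print(wide_data)
```

Starting the joins from NUTRIENT AMOUNT.csv keeps every nutrient record, and the pivot leaves an NA wherever a food has no value for a nutrient.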

“At present foods are grouped under 23 different group headings based on similar characteristics of the foods”

The food names are only available in this version in one length, which does not include abbreviations and can be up to 255 characters long

  • All of the nutrient data is stored per 100g of the food (edible portion) - that is, all nutritional data is on the same scale

  • in cleaning - only selected variables of interest (e.g., we removed mean SE for the nutrient values, since there were many missing values)

Analyses {#analyses} - Jeesung (both describing the analyses that were run - aka methods and running the analyses)

  • Analytic Procedure:
  • Dealing with NAs: Option 1. mean imputation; Option 2. Removing variables with more than 50% of NAs + mean imputation of the surviving variables
  1. Regular Clustering
  • k-means
  • result report: wordcloud of each cluster based on most frequent names in food names
  • PCA to see nutrients with top loadings per cluster
  2. Spectrum Clustering
  • k-means using PCs
  • result report: wordcloud of each cluster based on most frequent names in food names

  • Further Idea: for better interpretation of data… theoretically narrow down nutrients of interest (e.g. Energy, Carb, Protein, Fat, Sugar, Vitamin, Cholesterol) and run the same analyses?

1. Data Cleaning

2.1 Option 1 to handle NAs: use mean imputed data
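Mean imputation could look roughly like the sketch below. The helper `impute_mean` and the toy data are hypothetical and not taken from the project code.

```r
# Replace each NA with the mean of its column (the mean ignores NAs)
impute_mean <- function(df) {
  df[] <- lapply(df, function(col) {
    col[is.na(col)] <- mean(col, na.rm = TRUE)
    col
  })
  df
}

# Toy example with made-up values
toy <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 8))
imputed <- impute_mean(toy)
print(imputed)
```

One caveat: a column with no observed values at all would be filled with NaN, so such columns need to be dropped first.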

2.1.1 Regular K-means Clustering

Run k-means clustering
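A minimal k-means sketch on made-up data. Standardising the columns first is an assumption rather than something these notes confirm, and k = 2 is used only because the toy data has two groups (the cluster sizes reported below imply eight clusters on the real data).

```r
set.seed(571)  # arbitrary seed so the toy run is reproducible

# Toy numeric matrix standing in for the imputed nutrient columns
x <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 6), ncol = 2)
)

# Standardise columns so variables on large scales do not dominate the distances
x_scaled <- scale(x)

# nstart > 1 reruns k-means from several random starts and keeps the best fit
fit <- kmeans(x_scaled, centers = 2, nstart = 25)
print(paste("Size of each cluster is", paste(fit$size, collapse = ", ")))
```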

## Size of each cluster: 9, 21, 9, 23, 97, 1785, 3687, 59

Plotting : two random variables

Plotting : wordcloud

Mean calories per cluster
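Reporting the average energy of each cluster can be done with a simple group-wise mean. The sketch below uses made-up values, and the column names `energy_kcal` and `cluster` are hypothetical.

```r
# Toy data: energy per 100 g and a cluster label for each food (made-up values)
foods <- data.frame(
  energy_kcal = c(50, 70, 400, 420, 900),
  cluster     = c(1, 1, 2, 2, 3)
)

# Average energy within each cluster
mean_kcal <- aggregate(energy_kcal ~ cluster, data = foods, FUN = mean)
print(mean_kcal)
```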

Most important nutrients (PCA)
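One way to list the nutrients with the largest loadings is to sort the first column of the PCA rotation matrix by absolute value. This is a sketch on simulated data, so the column names and values are illustrative only.

```r
set.seed(571)
# Toy stand-in for the nutrient columns of one cluster (simulated values)
toy <- data.frame(
  ENERGY  = rnorm(30, 200, 50),
  PROTEIN = rnorm(30, 10, 3),
  SODIUM  = rnorm(30, 300, 100),
  IRON    = rnorm(30, 2, 1)
)

pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Nutrients ordered by the absolute size of their PC1 loading
pc1 <- pca$rotation[, "PC1"]
top_pc1 <- pc1[order(abs(pc1), decreasing = TRUE)]
print(head(top_pc1, 3))
```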

2.1.2 Spectrum Clustering

First, Run PCA

Most important nutrients used for spectrum clustering (PC1, PC2)

Run k-means clustering
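The two steps above (PCA, then k-means on the leading PC scores) can be sketched as follows. The data are simulated, and keeping two PCs with k = 2 is an assumption made for the toy example only.

```r
set.seed(571)
# Toy data with two groups of observations across five variables (simulated values)
x <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 5),
  matrix(rnorm(100, mean = 4), ncol = 5)
)

pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Keep the scores on the first two principal components
pc_scores <- pca$x[, 1:2]

# Cluster in PC space instead of the full variable space
fit_pc <- kmeans(pc_scores, centers = 2, nstart = 25)
print(table(fit_pc$cluster))
```

Clustering on a few PCs rather than on all columns reduces noise from the many sparse nutrient variables, at the cost of discarding whatever variance the dropped components carry.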

Plotting : two PCs (Note: all exploratory plots)

Plotting : wordcloud
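The word frequencies behind a per-cluster wordcloud can be computed in base R, as in this sketch. The food descriptions and the helper `word_freq` are made up for illustration.

```r
# Toy food descriptions with cluster labels (made-up examples)
foods <- data.frame(
  FoodDescription = c("Cheese, cheddar", "Cheese, swiss",
                      "Bread, white", "Bread, rye, toasted"),
  cluster = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Count how often each word appears in a set of descriptions
word_freq <- function(descriptions) {
  words <- unlist(strsplit(tolower(descriptions), "[^a-z]+"))
  words <- words[nchar(words) > 0]
  sort(table(words), decreasing = TRUE)
}

# One frequency table per cluster
freq_by_cluster <- lapply(split(foods$FoodDescription, foods$cluster), word_freq)
print(freq_by_cluster[["1"]])
```

Each frequency table could then be passed to the `wordcloud()` function of the wordcloud package (listed under the package citations) as its `words` and `freq` arguments.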

Mean calories per cluster

Most important nutrients (PCA)

2.2 Option 2 to handle NAs: remove variables with more than 50% of NAs + apply mean imputation
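Option 2 can be sketched as a filter on the share of NAs per column followed by mean imputation. The toy data frame and its column names are hypothetical.

```r
# Toy data with made-up values
toy <- data.frame(
  mostly_missing = c(NA, NA, NA, 1),   # 75% NA, dropped
  some_missing   = c(2, NA, 4, 6),     # 25% NA, kept
  complete       = c(1, 2, 3, 4)
)

# Share of NAs in each column
na_share <- colMeans(is.na(toy))

# Keep columns with at most 50% NAs
kept <- toy[, na_share <= 0.5, drop = FALSE]

# Mean-impute what is left
kept[] <- lapply(kept, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})
print(kept)
```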

2.2.1 Regular K-means Clustering

Run k-means clustering
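The number of clusters could be chosen with the silhouette method mentioned in the methods notes. Below is a sketch on simulated data using `silhouette()` from the cluster package. The candidate range 2 to 6 is arbitrary.

```r
library(cluster)  # provides silhouette()

set.seed(571)
# Toy data with three well-separated groups (simulated values)
x <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 5), ncol = 2),
  matrix(rnorm(60, mean = 10), ncol = 2)
)
d <- dist(x)

# Average silhouette width for each candidate number of clusters
ks <- 2:6
avg_sil <- sapply(ks, function(k) {
  fit <- kmeans(x, centers = k, nstart = 25)
  mean(silhouette(fit$cluster, d)[, "sil_width"])
})
names(avg_sil) <- ks
print(round(avg_sil, 2))

# Candidate k with the highest average silhouette width
best_k <- ks[which.max(avg_sil)]
print(best_k)
```

On the full dataset, the distance matrix for 5690 foods is large but should still fit in memory on a typical laptop.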

## Size of each cluster: 9, 21, 9, 23, 97, 1785, 3687, 59

Plotting : two random variables

Plotting : wordcloud of each cluster

Mean calories per cluster

Most important nutrients (PCA)

2.2.2 Spectrum Clustering

First, Run PCA
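`prcomp()` with scaling stops on constant columns, which is the "cannot rescale a constant/zero column to unit variance" error noted in the parking lot. After mean imputation, a variable whose observed values are all identical (the Appendix shows one fatty acid with a single distinct value of 0) becomes constant. One possible workaround, sketched on made-up data, is to drop zero-variance columns before the PCA.

```r
# Toy data with made-up values
toy <- data.frame(
  a        = c(1, 2, 3, 4),
  b        = c(2, 1, 4, 3),
  constant = c(0, 0, 0, 0)   # zero variance, like a nutrient that is 0 for every food
)

# Columns with zero variance cannot be rescaled to unit variance
col_sd <- sapply(toy, sd)
varying <- toy[, col_sd > 0, drop = FALSE]

pca <- prcomp(varying, center = TRUE, scale. = TRUE)
print(summary(pca))
```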

Most important nutrients used for spectrum clustering (PC1, PC2)

Run k-means clustering

Plotting : two PCs (Note: all exploratory plots)

Plotting : wordcloud

Mean calories per cluster

Most important nutrients (PCA)

3. Theoretically Driven Nutrients & Spectrum Clustering

  • Choose the most crucial nutrients (https://www.medicalnewstoday.com/articles/326132) and run clustering based on them
  • Vitamins, minerals (magnesium, calcium, phosphorus, sulfur, sodium, potassium, chloride, iron, selenium, zinc, manganese, chromium, copper, iodine, fluoride, molybdenum), water, protein, carbohydrates, fats, energy (kcal)
First, Run PCA

Most important nutrients used for spectrum clustering (PC1, PC2)

Run k-means clustering

Plotting : two PCs (Note: all exploratory plots)

Plotting : wordcloud

Mean calories per cluster

Most important nutrients (PCA)

Methods Parking Lot

We will run spectrum clustering to see which foods are grouped together based on their nutrient profile. In doing so, we want to find out whether foods form clear-cut clusters based on nutrient profile, regardless of their actual food group assignment. The resulting groups can be used to list the foods that contain the best combination of nutrients for a person’s dietary needs. This improves on diets that rely on the actual food grouping, since our clusters should represent the nutrient profile of a set of foods better than the food group label, which may be assigned arbitrarily. Since our data are high-dimensional (153 nutrient variables for 5690 unique foods), we are going to run principal component analysis (PCA) to reduce the dimensionality before clustering.

  • April 23rd first meeting << To Be Decided Together with Team>>
  1. How to deal with NAs? : want to run PCA and find the ideal number of PCs that will be used for spectrum clustering. can’t run prcomp with NAs.(if use na.omit – no single food will survive – recode all NAs to 0 for prelim analyses) – mean imputation

  2. prcomp : scale and center each column? – prcomp.default(nutrient_only, scale = T, center = T) : cannot rescale a constant/zero column to unit variance

  3. two Energy variables
  • only use either of two?;FYI, according to PCA, Energy is top loading nutrient of PC1
  • remove both? i like the idea of predicting average calories of each cluster that we will assign –> any recommended methods to do this? just report average calories of each cluster?

  • use tab_model or stargazer for showing regressions (see previous write ups for examples)

  • use silhouette method from lecture on clustering to decide on a number of clusters

  • just write the code to create the models & figures & Keana will incorporate the results into the results/appendix section - no need to write out the stat output from models - just a summary of what they generally show & logic will be helpful

  • last time there was writing between the code - which made the formatting look a little weird (ie weird spacing between paragraphs where the code was). this time, let’s have all of the code at the end of a section so the formatting doesn’t get messed up

  • possibly use this package for assessing model fit (if included): https://github.com/easystats/performance

  • What are the words in food description that predict the cluster they might be in?

Results - Keana

Conclusion {#conclusion} - Keana

chrome-extension://oemmndcbldboiebfnladdacbdfmadadm/file:///C:/Users/keana/OneDrive%20-%20PennO365/Comp_transfer2018/Penn/fourth_yr/nutrients_project_stat571/cnf-fcen-csv/CNF%202015%20users_guide%20EN.pdf - The CNF is particularly suited for assessment of diets, recipe development, menu planning when ingredients or menu items are not specific and for population nutrition surveillance activities, where nutrient intake distributions are used to conduct risk assessments such as modeling for fortification proposals. It is also useful in the initial stages of product development to ensure that nutritional targets can be met.

Limitations {#limitations} - Keana

chrome-extension://oemmndcbldboiebfnladdacbdfmadadm/file:///C:/Users/keana/OneDrive%20-%20PennO365/Comp_transfer2018/Penn/fourth_yr/nutrients_project_stat571/cnf-fcen-csv/CNF%202015%20users_guide%20EN.pdf

  • The exact nutrient composition of a specific apple or cookie is not found on the CNF. These averages, except where indicated otherwise, take into account sources of a given food across Canada. Local foods may have a different profile than the national average.
  • Most users are looking for an average or mean value for a generic representation of the foods as described. These generic values have been derived from combining brands of similar products, for example all major brands of ketchup; various varieties of oranges or similar beef cuts from various producers.
  • This dataset is only relevant to products available in Canada, so the results cannot be generalized to products from other countries. Future research should therefore explore whether these findings replicate among products in other countries.
  • The nutrient values are all standardized (per 100 g) and do not represent how much a person may actually consume from a package. They would need to be converted to the portions people actually eat to be interpretable.
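The conversion from the per-100 g scale to an actual portion is a simple rescaling, sketched below. The helper `per_serving` and the 30 g serving size are hypothetical, and serving sizes are not part of the cleaned dataset.

```r
# Convert a per-100 g nutrient value to the amount in a given serving
per_serving <- function(value_per_100g, serving_g) {
  value_per_100g * serving_g / 100
}

# Hypothetical example: 333 units of a nutrient per 100 g, eaten as a 30 g serving
print(per_serving(333, 30))
```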

Appendix {#appendix} - Keana

Full summary of clean dataset

Data Frame Summary

wide_data

Dimensions: 5690 x 154
Duplicates: 0

| No | Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|---|
| 1 | FoodGroupName [factor] | 1. Babyfoods; 2. Baked Products; 3. Beef Products; 4. Beverages; 5. Breakfast cereals; 6. Cereals, Grains and Pasta; 7. Dairy and Egg Products; 8. Fast Foods; 9. Fats and Oils; 10. Finfish and Shellfish Pro; [ 13 others ] | 94 (1.7%); 441 (7.8%); 170 (3.0%); 243 (4.3%); 212 (3.7%); 155 (2.7%); 241 (4.2%); 174 (3.1%); 144 (2.5%); 325 (5.7%); 3491 (61.4%) | 0 (0.0%) |
| 2 | PROTEIN [numeric] | Mean (sd): 11.1 (10.8); min < med < max: 0 < 7.6 < 85.6; IQR (CV): 16.6 (1) | 2261 distinct values | 0 (0.0%) |
| 3 | FAT (TOTAL LIPIDS) [numeric] | Mean (sd): 10 (16.7); min < med < max: 0 < 3.8 < 100; IQR (CV): 11.6 (1.7) | 1913 distinct values | 0 (0.0%) |
| 4 | CARBOHYDRATE, TOTAL (BY DIFFERENCE) [numeric] | Mean (sd): 22 (26.5); min < med < max: 0 < 10.3 < 100; IQR (CV): 31.6 (1.2) | 2756 distinct values | 0 (0.0%) |
| 5 | ASH, TOTAL [numeric] | Mean (sd): 1.9 (3.5); min < med < max: 0 < 1.2 < 99.8; IQR (CV): 1.2 (1.8) | 646 distinct values | 1 (0.0%) |
| 6 | ENERGY (KILOCALORIES) [numeric] | Mean (sd): 219 (174); min < med < max: 0 < 174 < 902; IQR (CV): 240 (0.8) | 665 distinct values | 0 (0.0%) |
| 7 | ALCOHOL [numeric] | Mean (sd): 0.1 (1.8); min < med < max: 0 < 0 < 42.5; IQR (CV): 0 (14.7) | 37 distinct values | 325 (5.7%) |
| 8 | MOISTURE [numeric] | Mean (sd): 55 (31); min < med < max: 0 < 64.7 < 100; IQR (CV): 50.3 (0.6) | 3417 distinct values | 0 (0.0%) |
| 9 | CAFFEINE [numeric] | Mean (sd): 3.9 (101); min < med < max: 0 < 0 < 5714; IQR (CV): 0 (25.6) | 61 distinct values | 312 (5.5%) |
| 10 | THEOBROMINE [numeric] | Mean (sd): 7 (77.1); min < med < max: 0 < 0 < 2634; IQR (CV): 0 (11.1) | 130 distinct values | 338 (5.9%) |
| 11 | ENERGY (KILOJOULES) [numeric] | Mean (sd): 915 (727); min < med < max: 0 < 727 < 3774; IQR (CV): 1006 (0.8) | 1658 distinct values | 1 (0.0%) |
| 12 | SUGARS, TOTAL [numeric] | Mean (sd): 7.7 (15); min < med < max: 0 < 1.3 < 99.8; IQR (CV): 7.6 (1.9) | 1505 distinct values | 1046 (18.4%) |
| 13 | FIBRE, TOTAL DIETARY [numeric] | Mean (sd): 2.4 (4.8); min < med < max: 0 < 0.8 < 79; IQR (CV): 2.8 (2) | 237 distinct values | 224 (3.9%) |
| 14 | CALCIUM [numeric] | Mean (sd): 76.9 (220); min < med < max: 0 < 24 < 7364; IQR (CV): 60 (2.9) | 468 distinct values | 51 (0.9%) |
| 15 | IRON [numeric] | Mean (sd): 2.6 (5.6); min < med < max: 0 < 1.1 < 124; IQR (CV): 2.1 (2.2) | 882 distinct values | 51 (0.9%) |
| 16 | MAGNESIUM [numeric] | Mean (sd): 39.7 (64.8); min < med < max: 0 < 21 < 781; IQR (CV): 23 (1.6) | 302 distinct values | 214 (3.8%) |
| 17 | PHOSPHORUS [numeric] | Mean (sd): 168 (236); min < med < max: 0 < 130 < 9918; IQR (CV): 176 (1.4) | 620 distinct values | 153 (2.7%) |
| 18 | POTASSIUM [numeric] | Mean (sd): 308 (447); min < med < max: 0 < 232 < 16500; IQR (CV): 215 (1.5) | 895 distinct values | 165 (2.9%) |
| 19 | SODIUM [numeric] | Mean (sd): 333 (1219); min < med < max: 0 < 82 < 38758; IQR (CV): 339 (3.7) | 1099 distinct values | 43 (0.8%) |
| 20 | ZINC [numeric] | Mean (sd): 1.6 (3); min < med < max: 0 < 0.8 < 91; IQR (CV): 1.8 (1.9) | 695 distinct values | 220 (3.9%) |
| 21 | COPPER [numeric] | Mean (sd): 0.2 (0.6); min < med < max: 0 < 0.1 < 15.1; IQR (CV): 0.2 (2.8) | 787 distinct values | 270 (4.7%) |
| 22 | MANGANESE [numeric] | Mean (sd): 0.6 (3.7); min < med < max: 0 < 0.1 < 133; IQR (CV): 0.4 (6.1) | 1226 distinct values | 585 (10.3%) |
| 23 | SELENIUM [numeric] | Mean (sd): 14.6 (36.7); min < med < max: 0 < 6.9 < 1917; IQR (CV): 19.4 (2.5) | 614 distinct values | 722 (12.7%) |
| 24 | RETINOL [numeric] | Mean (sd): 88.8 (840); min < med < max: 0 < 0 < 30000; IQR (CV): 11 (9.5) | 326 distinct values | 499 (8.8%) |
| 25 | BETA CAROTENE [numeric] | Mean (sd): 292 (1711); min < med < max: 0 < 0 < 42891; IQR (CV): 33 (5.9) | 612 distinct values | 653 (11.5%) |
| 26 | ALPHA-TOCOPHEROL [numeric] | Mean (sd): 1.2 (4.1); min < med < max: 0 < 0.3 < 149; IQR (CV): 0.6 (3.5) | 447 distinct values | 1555 (27.3%) |
| 27 | VITAMIN D (INTERNATIONAL UNITS) [numeric] | Mean (sd): 23.9 (241); min < med < max: 0 < 0 < 12716; IQR (CV): 6 (10.1) | 214 distinct values | 692 (12.2%) |
| 28 | VITAMIN D (D2 + D3) [numeric] | Mean (sd): 0.6 (6.3); min < med < max: 0 < 0 < 318; IQR (CV): 0.2 (10) | 129 distinct values | 690 (12.1%) |
| 29 | VITAMIN C [numeric] | Mean (sd): 8.2 (53.2); min < med < max: 0 < 0.1 < 1900; IQR (CV): 3.6 (6.5) | 458 distinct values | 184 (3.2%) |
| 30 | THIAMIN [numeric] | Mean (sd): 0.2 (0.6); min < med < max: 0 < 0.1 < 23.4; IQR (CV): 0.2 (2.6) | 812 distinct values | 280 (4.9%) |
| 31 | RIBOFLAVIN [numeric] | Mean (sd): 0.2 (0.5); min < med < max: 0 < 0.1 < 17.5; IQR (CV): 0.2 (2.1) | 709 distinct values | 261 (4.6%) |
| 32 | NIACIN (NICOTINIC ACID) PREFORMED [numeric] | Mean (sd): 3.1 (4.4); min < med < max: 0 < 1.6 < 128; IQR (CV): 4.4 (1.4) | 2828 distinct values | 234 (4.1%) |
| 33 | TOTAL NIACIN EQUIVALENT [numeric] | Mean (sd): 5.2 (5.6); min < med < max: 0 < 3.5 < 132; IQR (CV): 7.2 (1.1) | 3908 distinct values | 234 (4.1%) |
| 34 | PANTOTHENIC ACID [numeric] | Mean (sd): 0.6 (0.9); min < med < max: 0 < 0.4 < 21.9; IQR (CV): 0.7 (1.5) | 1316 distinct values | 936 (16.4%) |
| 35 | VITAMIN B-6 [numeric] | Mean (sd): 0.2 (1); min < med < max: 0 < 0.1 < 68.8; IQR (CV): 0.2 (4.4) | 756 distinct values | 397 (7.0%) |
| 36 | TOTAL FOLACIN [numeric] | Mean (sd): 37.7 (93.4); min < med < max: 0 < 12 < 3786; IQR (CV): 35 (2.5) | 290 distinct values | 408 (7.2%) |
| 37 | VITAMIN B-12 [numeric] | Mean (sd): 1.1 (6.8); min < med < max: 0 < 0 < 380; IQR (CV): 0.7 (6.1) | 899 distinct values | 354 (6.2%) |
| 38 | VITAMIN K [numeric] | Mean (sd): 20.8 (99.9); min < med < max: 0 < 1.7 < 1714; IQR (CV): 6 (4.8) | 434 distinct values | 2516 (44.2%) |
| 39 | FOLIC ACID [numeric] | Mean (sd): 8.4 (49.1); min < med < max: 0 < 0 < 2993; IQR (CV): 0 (5.8) | 160 distinct values | 160 (2.8%) |
| 40 | TRYPTOPHAN [numeric] | Mean (sd): 0.1 (0.1); min < med < max: 0 < 0.1 < 1.6; IQR (CV): 0.2 (0.9) | 458 distinct values | 1835 (32.2%) |
| 41 | THREONINE [numeric] | Mean (sd): 0.5 (0.5); min < med < max: 0 < 0.3 < 3.7; IQR (CV): 0.8 (0.9) | 1300 distinct values | 1782 (31.3%) |
| 42 | ISOLEUCINE [numeric] | Mean (sd): 0.6 (0.5); min < med < max: 0 < 0.4 < 5; IQR (CV): 0.8 (0.9) | 1369 distinct values | 1778 (31.2%) |
| 43 | LEUCINE [numeric] | Mean (sd): 1 (0.9); min < med < max: 0 < 0.7 < 7.2; IQR (CV): 1.4 (0.9) | 1834 distinct values | 1782 (31.3%) |
| 44 | LYSINE [numeric] | Mean (sd): 0.9 (0.9); min < med < max: 0 < 0.4 < 5.8; IQR (CV): 1.6 (1) | 1698 distinct values | 1764 (31.0%) |
| 45 | METHIONINE [numeric] | Mean (sd): 0.3 (0.3); min < med < max: 0 < 0.2 < 3.2; IQR (CV): 0.5 (1) | 859 distinct values | 1767 (31.1%) |
| 46 | CYSTINE [numeric] | Mean (sd): 0.2 (0.2); min < med < max: 0 < 0.1 < 2.1; IQR (CV): 0.2 (1) | 495 distinct values | 1842 (32.4%) |
| 47 | PHENYLALANINE [numeric] | Mean (sd): 0.5 (0.5); min < med < max: 0 < 0.5 < 5.2; IQR (CV): 0.7 (0.9) | 1274 distinct values | 1782 (31.3%) |
| 48 | TYROSINE [numeric] | Mean (sd): 0.4 (0.4); min < med < max: 0 < 0.3 < 3.3; IQR (CV): 0.6 (0.9) | 1131 distinct values | 1811 (31.8%) |
| 49 | VALINE [numeric] | Mean (sd): 0.6 (0.6); min < med < max: 0 < 0.4 < 6.2; IQR (CV): 0.9 (0.9) | 1428 distinct values | 1778 (31.2%) |
| 50 | ARGININE [numeric] | Mean (sd): 0.8 (0.8); min < med < max: 0 < 0.5 < 7.4; IQR (CV): 1.2 (1) | 1626 distinct values | 1791 (31.5%) |
| 51 | HISTIDINE [numeric] | Mean (sd): 0.4 (0.4); min < med < max: 0 < 0.2 < 2.3; IQR (CV): 0.6 (1) | 1075 distinct values | 1784 (31.4%) |
| 52 | ALANINE [numeric] | Mean (sd): 0.7 (0.7); min < med < max: 0 < 0.4 < 8; IQR (CV): 1 (1) | 1489 distinct values | 1836 (32.3%) |
| 53 | ASPARTIC ACID [numeric] | Mean (sd): 1.2 (1.1); min < med < max: 0 < 0.8 < 10.2; IQR (CV): 1.7 (0.9) | 1936 distinct values | 1850 (32.5%) |
| 54 | GLUTAMIC ACID [numeric] | Mean (sd): 2.4 (12.3); min < med < max: 0 < 1.9 < 757; IQR (CV): 2.9 (5.2) | 2432 distinct values | 1833 (32.2%) |
| 55 | GLYCINE [numeric] | Mean (sd): 0.6 (0.7); min < med < max: 0 < 0.4 < 19; IQR (CV): 1 (1.1) | 1438 distinct values | 1835 (32.2%) |
| 56 | PROLINE [numeric] | Mean (sd): 0.7 (0.6); min < med < max: 0 < 0.6 < 12.3; IQR (CV): 0.8 (1) | 1419 distinct values | 1843 (32.4%) |
| 57 | SERINE [numeric] | Mean (sd): 0.5 (0.5); min < med < max: 0 < 0.5 < 6.1; IQR (CV): 0.7 (0.9) | 1288 distinct values | 1844 (32.4%) |
| 58 | CHOLESTEROL [numeric] | Mean (sd): 41.5 (138); min < med < max: 0 < 1 < 3100; IQR (CV): 61 (3.3) | 291 distinct values | 194 (3.4%) |
| 59 | FATTY ACIDS, TRANS, TOTAL [numeric] | Mean (sd): 0.3 (1.7); min < med < max: 0 < 0 < 37.6; IQR (CV): 0.2 (5.9) | 498 distinct values | 3559 (62.5%) |
| 60 | FATTY ACIDS, SATURATED, TOTAL [numeric] | Mean (sd): 3.1 (5.8); min < med < max: 0 < 1.1 < 95.6; IQR (CV): 3.5 (1.9) | 2812 distinct values | 238 (4.2%) |
| 61 | FATTY ACIDS, SATURATED, 8:0, OCTANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 7.5; IQR (CV): 0 (7.1) | 266 distinct values | 1668 (29.3%) |
| 62 | FATTY ACIDS, SATURATED, 10:0, DECANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 6; IQR (CV): 0 (5.1) | 350 distinct values | 1364 (24.0%) |
| 63 | FATTY ACIDS, SATURATED, 12:0, DODECANOIC [numeric] | Mean (sd): 0.2 (1.7); min < med < max: 0 < 0 < 47; IQR (CV): 0 (8.7) | 448 distinct values | 1201 (21.1%) |
| 64 | FATTY ACIDS, SATURATED, 14:0, TETRADECANOIC [numeric] | Mean (sd): 0.2 (0.9); min < med < max: 0 < 0 < 22.8; IQR (CV): 0.2 (3.5) | 787 distinct values | 788 (13.8%) |
| 65 | FATTY ACIDS, SATURATED, 16:0, HEXADECANOIC [numeric] | Mean (sd): 1.7 (2.8); min < med < max: 0 < 0.7 < 43.5; IQR (CV): 2.1 (1.7) | 2322 distinct values | 602 (10.6%) |
| 66 | FATTY ACIDS, SATURATED, 18:0, OCTADECANOIC [numeric] | Mean (sd): 0.8 (1.7); min < med < max: 0 < 0.3 < 33.2; IQR (CV): 0.9 (2) | 1675 distinct values | 615 (10.8%) |
| 67 | FATTY ACIDS, MONOUNSATURATED, 18:1undifferentiated, OCTADECENOIC [numeric] | Mean (sd): 3.5 (7.2); min < med < max: 0 < 1 < 82.6; IQR (CV): 4 (2) | 2708 distinct values | 578 (10.2%) |
| 68 | FATTY ACIDS, POLYUNSATURATED, 18:2undifferentiated, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 1.8 (4.7); min < med < max: 0 < 0.4 < 74.6; IQR (CV): 1.5 (2.6) | 2079 distinct values | 561 (9.9%) |
| 69 | FATTY ACIDS, POLYUNSATURATED, 18:3undifferentiated, LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0.2 (1.2); min < med < max: 0 < 0.1 < 53.4; IQR (CV): 0.1 (6) | 689 distinct values | 656 (11.5%) |
| 70 | FATTY ACIDS, POLYUNSATURATED, 20:4, EICOSATETRAENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 1.8; IQR (CV): 0 (2.6) | 261 distinct values | 1210 (21.3%) |
| 71 | FATTY ACIDS, POLYUNSATURATED, 22:6 n-3, DOCOSAHEXAENOIC (DHA) [numeric] | Mean (sd): 0 (0.5); min < med < max: 0 < 0 < 18.2; IQR (CV): 0 (9.6) | 296 distinct values | 137 (2.4%) |
| 72 | FATTY ACIDS, MONOUNSATURATED, 16:1undifferentiated, HEXADECENOIC [numeric] | Mean (sd): 0.2 (1); min < med < max: 0 < 0 < 18.9; IQR (CV): 0.2 (4) | 770 distinct values | 836 (14.7%) |
| 73 | FATTY ACIDS, POLYUNSATURATED, 18:4, OCTADECATETRAENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 3; IQR (CV): 0 (10.2) | 126 distinct values | 1781 (31.3%) |
| 74 | FATTY ACIDS, POLYUNSATURATED, 20:5 n-3, EICOSAPENTAENOIC (EPA) [numeric] | Mean (sd): 0 (0.4); min < med < max: 0 < 0 < 13.2; IQR (CV): 0 (9.7) | 259 distinct values | 1306 (23.0%) |
| 75 | FATTY ACIDS, MONOUNSATURATED, 22:1undifferentiated, DOCOSENOIC [numeric] | Mean (sd): 0 (0.8); min < med < max: 0 < 0 < 41.2; IQR (CV): 0 (17.3) | 199 distinct values | 1532 (26.9%) |
| 76 | FATTY ACIDS, POLYUNSATURATED, 22:5 n-3, DOCOSAPENTAENOIC (DPA) [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 5.6; IQR (CV): 0 (11.1) | 162 distinct values | 149 (2.6%) |
| 77 | FATTY ACIDS, MONOUNSATURATED, TOTAL [numeric] | Mean (sd): 3.9 (7.8); min < med < max: 0 < 1.2 < 83.7; IQR (CV): 4.5 (2) | 2880 distinct values | 314 (5.5%) |
| 78 | FATTY ACIDS, POLYUNSATURATED, TOTAL [numeric] | Mean (sd): 2.2 (5.2); min < med < max: 0 < 0.6 < 74.6; IQR (CV): 1.8 (2.4) | 2381 distinct values | 316 (5.6%) |
| 79 | NATURALLY OCCURRING FOLATE [numeric] | Mean (sd): 29.2 (75.3); min < med < max: 0 < 9 < 2340; IQR (CV): 20 (2.6) | 261 distinct values | 503 (8.8%) |
| 80 | RETINOL ACTIVITY EQUIVALENTS [numeric] | Mean (sd): 115 (836); min < med < max: 0 < 3 < 30000; IQR (CV): 33 (7.3) | 463 distinct values | 260 (4.6%) |
| 81 | DIETARY FOLATE EQUIVALENTS [numeric] | Mean (sd): 44.4 (119); min < med < max: 0 < 12 < 5881; IQR (CV): 41 (2.7) | 332 distinct values | 499 (8.8%) |
| 82 | FATTY ACIDS, POLYUNSATURATED, 18:2 c,c n-6, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 2.3 (6); min < med < max: 0 < 0.5 < 74.6; IQR (CV): 1.5 (2.6) | 1285 distinct values | 3256 (57.2%) |
| 83 | FATTY ACIDS, POLYUNSATURATED, 20:3, EICOSATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.4; IQR (CV): 0 (7.1) | 89 distinct values | 2322 (40.8%) |
| 84 | FATTY ACIDS, POLYUNSATURATED, 18:3 c,c,c n-3 LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0.2 (1.3); min < med < max: 0 < 0 < 53.4; IQR (CV): 0.1 (6.2) | 619 distinct values | 954 (16.8%) |
| 85 | FATTY ACIDS, POLYUNSATURATED, 18:3 c,c,c n-6, g-LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1; IQR (CV): 0 (18.1) | 49 distinct values | 307 (5.4%) |
| 86 | BETA CRYPTOXANTHIN [numeric] | Mean (sd): 15.2 (198); min < med < max: 0 < 0 < 6252; IQR (CV): 0 (13) | 128 distinct values | 2334 (41.0%) |
| 87 | LYCOPENE [numeric] | Mean (sd): 220 (1807); min < med < max: 0 < 0 < 46260; IQR (CV): 0 (8.2) | 190 distinct values | 2324 (40.8%) |
| 88 | LUTEIN AND ZEAXANTHIN [numeric] | Mean (sd): 260 (1387); min < med < max: 0 < 0 < 19697; IQR (CV): 39 (5.3) | 419 distinct values | 2346 (41.2%) |
| 89 | FATTY ACIDS, POLYUNSATURATED, 20:3 n-6, EICOSATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.4; IQR (CV): 0 (14.6) | 71 distinct values | 521 (9.2%) |
| 90 | FATTY ACIDS, POLYUNSATURATED, 20:4 n-6, ARACHIDONIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 1.8; IQR (CV): 0 (2.7) | 227 distinct values | 2625 (46.1%) |
| 91 | FATTY ACIDS, POLYUNSATURATED, 20:3 n-3 EICOSATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1; IQR (CV): 0 (16.7) | 52 distinct values | 505 (8.9%) |
| 92 | VITAMIN B12, ADDED [numeric] | Mean (sd): 1 (17.5); min < med < max: 0 < 0 < 380; IQR (CV): 0 (18.1) | 28 distinct values | 5218 (91.7%) |
| 93 | ALPHA-TOCOPHEROL, ADDED [numeric] | Mean (sd): 0.1 (0.9); min < med < max: 0 < 0 < 16.9; IQR (CV): 0 (12) | 11 distinct values | 5231 (91.9%) |
| 94 | VITAMIN D2, ERGOCALCIFEROL [numeric] | Mean (sd): 0.3 (2); min < med < max: 0 < 0 < 28.1; IQR (CV): 0 (6.3) | 22 distinct values | 5344 (93.9%) |
| 95 | FATTY ACIDS, SATURATED, 4:0, BUTANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 3.2; IQR (CV): 0 (5) | 274 distinct values | 1839 (32.3%) |
| 96 | FATTY ACIDS, SATURATED, 6:0, HEXANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2; IQR (CV): 0 (4.9) | 224 distinct values | 1816 (31.9%) |
| 97 | ALPHA CAROTENE [numeric] | Mean (sd): 40.8 (387); min < med < max: 0 < 0 < 14251; IQR (CV): 0 (9.5) | 164 distinct values | 2340 (41.1%) |
| 98 | FATTY ACIDS, MONOUNSATURATED, 22:1c, DOCOSENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.1; IQR (CV): 0 (5.7) | 100 distinct values | 2912 (51.2%) |
| 99 | FATTY ACIDS, POLYUNSATURATED, 18:3i, LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.3; IQR (CV): 0 (4.7) | 54 distinct values | 4419 (77.7%) |
| 100 | FATTY ACIDS, MONOUNSATURATED, 22:1t, DOCOSENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.1; IQR (CV): 0 (18.5) | 16 distinct values | 2983 (52.4%) |
| 101 | SUCROSE [numeric] | Mean (sd): 2 (7.3); min < med < max: 0 < 0 < 99.8; IQR (CV): 0.4 (3.7) | 487 distinct values | 3044 (53.5%) |
| 102 | GLUCOSE [numeric] | Mean (sd): 0.8 (2.5); min < med < max: 0 < 0 < 35.8; IQR (CV): 0.5 (3.2) | 399 distinct values | 3051 (53.6%) |
| 103 | FRUCTOSE [numeric] | Mean (sd): 0.7 (2.5); min < med < max: 0 < 0 < 55.6; IQR (CV): 0.3 (3.6) | 387 distinct values | 3055 (53.7%) |
| 104 | LACTOSE [numeric] | Mean (sd): 0.3 (1.2); min < med < max: 0 < 0 < 13.2; IQR (CV): 0 (4) | 225 distinct values | 3076 (54.1%) |
| 105 | MALTOSE [numeric] | Mean (sd): 0.2 (0.8); min < med < max: 0 < 0 < 16.4; IQR (CV): 0 (3.9) | 217 distinct values | 3098 (54.4%) |
| 106 | GALACTOSE [numeric] | Mean (sd): 0 (0.5); min < med < max: 0 < 0 < 19.9; IQR (CV): 0 (14.1) | 53 distinct values | 3122 (54.9%) |
| 107 | FATTY ACIDS, SATURATED, 20:0, EICOSANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 4.6; IQR (CV): 0 (4.6) | 183 distinct values | 3649 (64.1%) |
| 108 | FATTY ACIDS, SATURATED, 22:0, DOCOSANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 3.7; IQR (CV): 0 (5.8) | 133 distinct values | 3691 (64.9%) |
| 109 | FATTY ACIDS, MONOUNSATURATED, 14:1, TETRADECENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 1.8; IQR (CV): 0 (4) | 156 distinct values | 3666 (64.4%) |
| 110 | FATTY ACIDS, MONOUNSATURATED, 20:1, EICOSENOIC [numeric] | Mean (sd): 0.1 (0.6); min < med < max: 0 < 0 < 15; IQR (CV): 0 (6.3) | 365 distinct values | 1759 (30.9%) |
| 111 | FATTY ACIDS, SATURATED, 15:0, PENTADECANOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.9; IQR (CV): 0 (2.9) | 121 distinct values | 3772 (66.3%) |
| 112 | FATTY ACIDS, SATURATED, 17:0, HEPTADECANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 0.8; IQR (CV): 0 (2) | 189 distinct values | 3723 (65.4%) |
| 113 | FATTY ACIDS, SATURATED, 24:0, TETRACOSANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 4.7; IQR (CV): 0 (9.4) | 91 distinct values | 3949 (69.4%) |
| 114 | STARCH [numeric] | Mean (sd): 4 (11.4); min < med < max: 0 < 0 < 73.3; IQR (CV): 0 (2.9) | 360 distinct values | 3755 (66.0%) |
| 115 | BETA-TOCOPHEROL [numeric] | Mean (sd): 0.1 (0.5); min < med < max: 0 < 0 < 10.5; IQR (CV): 0.1 (5.1) | 65 distinct values | 4929 (86.6%) |
| 116 | GAMMA-TOCOPHEROL [numeric] | Mean (sd): 2.3 (5.7); min < med < max: 0 < 0.2 < 65.2; IQR (CV): 1.7 (2.5) | 274 distinct values | 4922 (86.5%) |
| 117 | DELTA-TOCOPHEROL [numeric] | Mean (sd): 0.4 (1.3); min < med < max: 0 < 0 < 15.4; IQR (CV): 0.2 (3.2) | 148 distinct values | 4928 (86.6%) |
| 118 | FATTY ACIDS, MONOUNSATURATED, 16:1t, HEXADECENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 6.1; IQR (CV): 0 (19.6) | 73 distinct values | 3959 (69.6%) |
| 119 | FATTY ACIDS, MONOUNSATURATED, 18:1t, OCTADECENOIC [numeric] | Mean (sd): 0.1 (0.7); min < med < max: 0 < 0 < 20.2; IQR (CV): 0.1 (5.6) | 295 distinct values | 4118 (72.4%) |
| 120 | FATTY ACIDS, POLYUNSATURATED, 18:2i, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2.3; IQR (CV): 0 (3.6) | 140 distinct values | 4332 (76.1%) |
| 121 | FATTY ACIDS, MONOUNSATURATED, 24:1c, TETRACOSENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.6; IQR (CV): 0 (7.9) | 45 distinct values | 4148 (72.9%) |
| 122 | FATTY ACIDS, MONOUNSATURATED, 16:1c, HEXADECENOIC [numeric] | Mean (sd): 0.1 (0.3); min < med < max: 0 < 0 < 6.9; IQR (CV): 0.1 (2.4) | 396 distinct values | 3923 (68.9%) |
| 123 | FATTY ACIDS, POLYUNSATURATED, 20:2 c,c EICOSADIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.7; IQR (CV): 0 (3.1) | 128 distinct values | 3854 (67.7%) |
| 124 | FATTY ACIDS, MONOUNSATURATED, 18:1c, OCTADECENOIC [numeric] | Mean (sd): 4.7 (72.2); min < med < max: 0 < 1.1 < 2845; IQR (CV): 3.2 (15.5) | 1066 distinct values | 4132 (72.6%) |
| 125 | FATTY ACIDS, MONOUNSATURATED, 17:1, HEPTADECENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.1; IQR (CV): 0 (2.7) | 135 distinct values | 3903 (68.6%) |
| 126 | FATTY ACIDS, TOTAL TRANS-MONOENOIC [numeric] | Mean (sd): 0.1 (0.7); min < med < max: 0 < 0 < 20.2; IQR (CV): 0.1 (6.2) | 285 distinct values | 4249 (74.7%) |
| 127 | FATTY ACIDS, MONOUNSATURATED, 15:1, PENTADECENOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 6; IQR (CV): 0 (28) | 27 distinct values | 4050 (71.2%) |
| 128 | FATTY ACIDS, POLYUNSATURATED, CONJUGATED, 18:2 cla, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.1; IQR (CV): 0 (4.4) | 90 distinct values | 4331 (76.1%) |
| 129 | FATTY ACIDS, POLYUNSATURATED, 22:4 n-6, DOCOSATETRAENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.3; IQR (CV): 0 (3.1) | 66 distinct values | 4229 (74.3%) |
| 130 | FATTY ACIDS, TOTAL TRANS-POLYENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2.5; IQR (CV): 0 (3.8) | 154 distinct values | 4330 (76.1%) |
| 131 | CHOLINE, TOTAL [numeric] | Mean (sd): 39.1 (70.6); min < med < max: 0 < 19 < 2403; IQR (CV): 50.5 (1.8) | 910 distinct values | 2813 (49.4%) |
| 132 | BETAINE [numeric] | Mean (sd): 10.6 (31.5); min < med < max: 0 < 3.9 < 630; IQR (CV): 10.1 (3) | 257 distinct values | 4589 (80.7%) |
| 133 | FATTY ACIDS, POLYUNSATURATED, TOTAL OMEGA N-3 [numeric] | Mean (sd): 0.5 (2.4); min < med < max: 0 < 0.1 < 53.4; IQR (CV): 0.2 (4.9) | 548 distinct values | 3717 (65.3%) |
| 134 | FATTY ACIDS, POLYUNSATURATED, TOTAL OMEGA N-6 [numeric] | Mean (sd): 3.1 (23.3); min < med < max: 0 < 0.5 < 953; IQR (CV): 1.4 (7.6) | 1055 distinct values | 3711 (65.2%) |
| 135 | ASPARTAME [numeric] | Mean (sd): 51.1 (403); min < med < max: 0 < 0 < 3722; IQR (CV): 0 (7.9) | 0: 82 (94.3%); 37: 1 (1.1%); 42: 1 (1.1%); 52: 1 (1.1%); 597: 1 (1.1%); 3722: 1 (1.1%) | 5603 (98.5%) |
| 136 | TOTAL PLANT STEROL [numeric] | Mean (sd): 26.4 (80.7); min < med < max: 0 < 0 < 1190; IQR (CV): 14 (3.1) | 117 distinct values | 4995 (87.8%) |
| 137 | MANNITOL [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.2; IQR (CV): 0 (17.6) | 3 distinct values | 4313 (75.8%) |
| 138 | SORBITOL [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2.3; IQR (CV): 0 (14) | 9 distinct values | 4304 (75.6%) |
| 139 | STIGMASTEROL [numeric] | Mean (sd): 1.3 (5.9); min < med < max: 0 < 0 < 59; IQR (CV): 0 (4.6) | 26 distinct values | 5183 (91.1%) |
| 140 | TOTAL MONOSACCARIDES [numeric] | Mean (sd): 0.8 (2.7); min < med < max: 0 < 0 < 30.6; IQR (CV): 0.1 (3.3) | 267 distinct values | 3810 (67.0%) |
| 141 | TOTAL DISACCHARIDES [numeric] | Mean (sd): 1.5 (4.8); min < med < max: 0 < 0 < 47.2; IQR (CV): 0 (3.3) | 295 distinct values | 3824 (67.2%) |
| 142 | BETA-SITOSTEROL [numeric] | Mean (sd): 14 (47.1); min < med < max: 0 < 0 < 621; IQR (CV): 0 (3.4) | 54 distinct values | 5187 (91.2%) |
| 143 | HYDROXYPROLINE [numeric] | Mean (sd): 0.1 (0.1); min < med < max: 0 < 0 < 0.7; IQR (CV): 0.2 (1.3) | 197 distinct values | 5083 (89.3%) |
| 144 | FATTY ACIDS, SATURATED, 13:0 TRIDECANOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.1; IQR (CV): 0 (10.7) | 10 distinct values | 5241 (92.1%) |
| 145 | FATTY ACIDS, POLYUNSATURATED, 21:5 [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.4; IQR (CV): 0 (11.5) | 12 distinct values | 4734 (83.2%) |
| 146 | FATTY ACIDS, MONOUNSATURATED, 24:1undifferentiated, TETRACOSENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 3; IQR (CV): 0 (19.8) | 32 distinct values | 4550 (80.0%) |
| 147 | FATTY ACIDS, MONOUNSATURATED, 12:1, LAUROLEIC [numeric] | 1 distinct value | 0: 351 (100.0%) | 5339 (93.8%) |
| 148 | FATTY ACIDS, POLYUNSATURATED, 22:3, [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.1; IQR (CV): 0 (13.6) | 10 distinct values | 4754 (83.6%) |
| 149 | FATTY ACIDS, POLYUNSATURATED, 22:2, DOCOSADIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0; IQR (CV): 0 (15.5) | 4 distinct values | 4690 (82.4%) |
| 150 | FATTY ACIDS, POLYUNSATURATED, 18:2t,t , OCTADECADIENENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.5; IQR (CV): 0 (5.9) | 59 distinct values | 4622 (81.2%) |
| 151 | CAMPESTEROL [numeric] | Mean (sd): 3.8 (16); min < med < max: 0 < 0 < 189; IQR (CV): 0 (4.2) | 27 distinct values | 5400 (94.9%) |
| 152 | BIOTIN [numeric] | Mean (sd): 6.1 (6.5); min < med < max: 0 < 3.5 < 31.6; IQR (CV): 7.2 (1.1) | 71 distinct values | 5585 (98.2%) |
| 153 | NA [numeric] | 1 distinct value | 1 distinct values | 5689 (100.0%) |
| 154 | OXALIC ACID [numeric] | Mean (sd): 0.3 (0.4); min < med < max: 0 < 0.1 < 1.7; IQR (CV): 0.3 (1.4) | 27 distinct values | 5639 (99.1%) |

Missing values in clean dataset

Citations for packages used

Analyses were conducted using the R Statistical language (version 3.6.0; R Core Team, 2019) on macOS Mojave 10.14.6, using the packages GGally (version 2.0.0; Barret Schloerke et al., 2020), gtsummary (version 1.3.6; Daniel Sjoberg et al., 2021), summarytools (version 0.9.8; Dominic Comtois, 2020), Matrix (version 1.2.17; Douglas Bates and Martin Maechler, 2019), RColorBrewer (version 1.1.2; Erich Neuwirth, 2014), ggplot2 (version 3.3.3; Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.), tidyverse (version 1.2.1; Hadley Wickham, 2017), stringr (version 1.4.0; Hadley Wickham, 2019), tidyr (version 1.1.2; Hadley Wickham, 2020), forcats (version 0.5.1; Hadley Wickham, 2021), readr (version 1.3.1; Hadley Wickham, Jim Hester and Romain Francois, 2018), dplyr (version 1.0.2; Hadley Wickham et al., 2020), stargazer (version 5.2.2; Hlavac, Marek, 2018), wordcloud (version 2.6; Ian Fellows, 2018), tm (version 0.7.8; Ingo Feinerer and Kurt Hornik, 2020), glmnet (version 4.1.1; Jerome Friedman et al., 2010), car (version 3.0.3; John Fox and Sanford Weisberg, 2019), carData (version 3.0.2; John Fox, Sanford Weisberg and Brad Price, 2018), here (version 1.0.1; Kirill Müller, 2020), tibble (version 3.1.0; Kirill Müller and Hadley Wickham, 2021), NLP (version 0.2.1; Kurt Hornik, 2020), purrr (version 0.3.4; Lionel Henry and Hadley Wickham, 2020), sjPlot (version 2.8.6; Lüdecke D, 2020), report (version 0.3.0; Makowski et al., 2020), data.table (version 1.12.2; Matt Dowle and Arun Srinivasan, 2019), varhandle (version 2.0.5; Mehrad Mahmoudian, 2020), SnowballC (version 0.7.0; Milan Bouchet-Valat, 2020), pacman (version 0.5.1; Rinker et al., 2017), corrplot (version 0.84; Taiyun Wei and Viliam Simko, 2017) and pROC (version 1.17.0.1; Xavier Robin et al., 2011).

References
